Moderation and Heterogeneous Effects

Our first steps in moderation with R

Welcome

Introduction!

Welcome to our ninth tutorial for the Statistics II: Statistical Modeling & Causal Inference (with R) course.

During this week’s lecture you were introduced to non-linear models and moderation and heterogeneous effects. Our focus on the practical component will be on the latter part.

In this lab session we will:

  • Learn how to perform interaction models with lm()
  • Learn how to extract marginal/partial effects with marginaleffects::avg_slopes() and margins::margins()
  • Learn how to extract predictive margins with marginaleffects::avg_predictions() andggeffects::::ggeffect()
  • Learn how to vectorize multiple ifelse() statements with dplyr::case_when()

Measuring the effect of an information intervention on peace agreement support

The country of Absurdistan has had an ongoing civil conflict for the past 60 years. The fighting between national government forces and guerrilla members from the National Revolutionary Army (NRA) has lead to more than 200,000 deaths, 8 million internally displaced persons, and countless victims between 1960 to 2020.

The civil war in Absurdistan has been a low-intensity asymmetric war. The legacies of the conflict have been bore largely by regions in the periphery of the country. Large cities and industrial regions have been spared from most of the fighting.

The government and the leadership of the NRA reached a peace agreement a couple of months ago; however, the agreement has been received poorly by the general population of Absurdistan. The opposition party the Undemocratic Center (UC) has allegedly ran campaigns misrepresenting the contents of the agreement in partisan media outlets and social media.

The Special Envoy for Peace has established a task-force to design a strategy to increase support for the peace agreement. Many in the task-force suspect that if the public were properly informed about the agreement reached, the levels of support would be higher.

You are hired as a scientific consultant for the task-force. You run a survey experiment on a sample of 1000 respondents. You randomly assign respondents to watch an educational 2 minute video about the peace agreement.


Packages

set.seed(42) #for consistent results

library(dplyr) # to wrangle our data
library(tidyr) # to wrangle our data - pivot_longer()
library(ggplot2) # to render our graphs
library(readr) # for loading the .csv data
library(janitor) # for data management and tabyl()
library(kableExtra) # to render better formatted tables
library(stargazer) # for formatting your model output
library(modelsummary) # for formatting your model output

library(margins) #for calculating MARGINAL/PARTIAL EFFECT
library(ggeffects) # easily calculating PREDICTIVE MARGINS
library(marginaleffects) #for calculating MARGINAL/PARTIAL EFFECT and PREDICTIVE MARGINS

Exploratory analysis

moderation_df <- readr::read_csv("https://raw.githubusercontent.com/seramirezruiz/hertiestats2/master/data/moderation_df.csv") # simulated data
names(moderation_df) # to check the names of the variables in our data
## [1] "subject_id"          "treatment"           "victim_verbose"     
## [4] "victim"              "female"              "female_verbose"     
## [7] "support_thermometer"

Our data set moderation_df, contains the following information:

  • subject_id: A unique numeric identification for each respondent
  • treatment: A binary marker for treatment
  • victim_verbose: A verbose binary marker of the respondents’ victimhood status (Not a Victim-Victim)
  • victim: A numeric binary marker of the respondents’ victimhood status (0-1)
  • female: A numeric binary marker of the respondents’ sex (0-1)
  • female_verbose: A verbose binary marker of the respondents’ sex (Male-Female)
  • support_thermometer: A continuous measure of support for the agreement (0-100)

Let’s explore who is in our sample

We can use what we have learned about the janitor::tabyl() function, to check who was in our sample:

moderation_df %>% 
  janitor::tabyl(treatment) %>% 
  knitr::kable(col.names = c("Treatment", "N", "Percent, %")) %>% # create kable table
  kableExtra::kable_styling() # view kable table
Treatment N Percent, %
0 500 0.5
1 500 0.5
moderation_df %>% 
  janitor::tabyl(treatment, victim_verbose) %>% 
  janitor::adorn_totals(c("row", "col")) %>%
  knitr::kable(col.names = c("Treatment", "Not a victim", "Victim", "Total")) %>% # create kable table
  kableExtra::kable_styling() %>% # view kable table
  kableExtra::add_header_above(c("", "Victimization status" = 2, ""))
Victimization status
Treatment Not a victim Victim Total
0 250 250 500
1 250 250 500
Total 500 500 1000

Let’s explore visually the observed levels of public support for the agreement

ggplot(moderation_df, aes(x = support_thermometer)) +
  geom_density(fill = "#af8dc3", alpha = 0.5) +
  theme_minimal() +
  labs(x = "Support thermometer",
       y = "Density")


What do we see in this graph?


Let’s explore visually the observed levels of public support for the agreement conditional on the treatment status

ggplot(moderation_df, aes(x = support_thermometer, fill = factor(treatment))) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  labs(x = "Support thermometer",
       y = "Density",
       fill = "Treatment") +
  theme(legend.position = "bottom") +
  scale_fill_manual(name = " ", # changes to fill dimension
                    values = c("#a7a8aa", "#cc0055"),
                    labels = c("Control", "Treatment")) 


What do we see through these distributions?


Let’s explore visually the observed levels of public support for the agreement conditional on the treatment and victimhood status

ggplot(moderation_df, aes(x = support_thermometer, fill = factor(treatment))) +
  geom_density(alpha = 0.5) +
  facet_grid(treatment~victim_verbose) +
  theme_bw() +
  labs(x = "Support thermometer",
       y = "Density",
       fill = "Treatment") +
  theme(legend.position = "bottom") +
  scale_fill_manual(name = " ", # changes to fill dimension
                    values = c("#a7a8aa", "#cc0055"),
                    labels = c("Control", "Treatment")) 


What patterns do we see here?


Modeling and estimating

During this week’s lecture, we learned that we can explicitly model heterogeneity in treatment effects for subgroups. Thus, we can address the tension between having to do inference at the group level, and the recognition of individual differences.

The analysis of heterogeneity can be very important for the design of our strategy. There are competing theories about the effects of conflict victimization on political behavior and attitudes. Some of the literature points to victims developing pro-social attitudes, while others suggest that victims become intransigent towards out-groups.

Could factual information about the peace agreement be received differently by victims and non-victims?


How to estimate heterogenous treatment effects

Heterogeneous treatment effects are usually estimated with regression models that include an interaction between the treatment and the moderator.

In our case, the formula would look like this:

\[Y_{i} = β_0 + β_1D_{i} + β_2Victim_{i} + β_3D_i * Victim_{i}+ ϵ_i\]

  • \(β_0\): Constant
  • \(β_1\): Effect of \(D_i\) on \(Y_i\) if \(Victim_i\) is zero
  • \(β_2\): Effect of \(Victim_i\) on \(Y_i\) if \(D_i\) is zero
  • \(β_3\): Difference in treatment effects of \(D_i\) depending on \(Victim_i\)

or in lm() terms (same result):

lm(outcome ~ treatment + moderator + (treatment*moderator))
lm(outcome ~ treatment + moderator + treatment:moderator)
lm(outcome ~ treatment * moderator)

Think of the switch logic


Let’s model our results

We will move forward by creating two models:

  • A pooled model, where we will regress support_thermometer on treatment to gather the ATE.
  • An interaction model, where we will add an interaction between treatment and victim to explore the moderating effect of victimhood.

Pooled modeling

ate_model <- lm(support_thermometer ~ treatment, data = moderation_df)
modelsummary::modelsummary(ate_model, 
                           fmt = 2,
                           gof_omit = "AIC|BIC|Log.Lik.",
                           statistic = "conf.int",
                           stars = T)
Model 1
(Intercept) 35.05***
[33.63, 36.48]
treatment 18.90***
[16.88, 20.91]
Num.Obs. 1000
R2 0.254
R2 Adj. 0.253
F 339.360
RMSE 16.20
+ p

What does this model tell us?

The pooled model suggests that the subjects that the support for the peace agreement is about 18.9 percentage points higher on average for the subjects that watched the educational video. Given the randomized delivery of the treatment, we would expect this estimate to represent the average treatment effect (ATE). We suspect, however, that the video may affect differently victims from non-victims of the conflict.


Here is a visual illustration of the values rendered by the naive regression

ggplot(moderation_df, aes(x = support_thermometer, fill = factor(treatment))) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  geom_vline(xintercept = 35.055, linetype = "longdash", color = "#a7a8aa") + #D=0 (just beta0)
  geom_vline(xintercept = 35.055 + 18.898, linetype = "longdash", color = "#cc0055") + #D=1 (beta0+beta1) +
  geom_text(aes(label = "ß0", x = 35.055 + 3, y = 0.04), color = "#a7a8aa") + # we add the 3 to repel from the line
  geom_text(aes(label = "ß0 + ß1", x = 35.055 + 18.898 + 6, y = 0.04 ), color = "#cc0055") + # we add the 6 to repel from the line
  labs(x = "Support thermometer",
       y = "Density",
       fill = "Treatment") +
  theme(legend.position = "bottom") +
  scale_fill_manual(name = " ", # changes to fill dimension
                    values = c("#a7a8aa", "#cc0055"),
                    labels = c("Control", "Treatment")) 


Interaction model

Following any of the different formats renders the same results.

# All these formulations get us to the same place
interaction_model <- lm(support_thermometer ~ treatment + victim + (treatment*victim), data = moderation_df)
interaction_model_2 <- lm(support_thermometer ~ treatment + victim + treatment:victim, data = moderation_df)
interaction_model_3 <- lm(support_thermometer ~ treatment*victim, data = moderation_df)
modelsummary::modelsummary(list(interaction_model, interaction_model_2, interaction_model_3), 
                           fmt = 2,
                           gof_omit = "AIC|BIC|Log.Lik.",
                           statistic = "conf.int",
                           stars = T)
Model 1 Model 2 Model 3
(Intercept) 27.93*** 27.93*** 27.93***
[26.85, 29.00] [26.85, 29.00] [26.85, 29.00]
treatment 8.01*** 8.01*** 8.01***
[6.49, 9.53] [6.49, 9.53] [6.49, 9.53]
victim 14.26*** 14.26*** 14.26***
[12.73, 15.78] [12.73, 15.78] [12.73, 15.78]
treatment × victim 21.77*** 21.77*** 21.77***
[19.62, 23.92] [19.62, 23.92] [19.62, 23.92]
Num.Obs. 1000 1000 1000
R2 0.787 0.787 0.787
R2 Adj. 0.786 0.786 0.786
F 1227.193 1227.193 1227.193
RMSE 8.66 8.66 8.66
+ p

What does this model tell us?

  • \(β_0\): Constant = The average support for non-victims who were not exposed to the video was 27.92
  • \(β_1\): Effect of \(D_i\) on \(Y_i\) if \(Victim_i\) is zero = The educational video results in an increase of about 8 percentage points of the peace agreement for the non-victims
  • \(β_2\): Effect of \(Victim_i\) on \(Y_i\) if \(D_i\) is zero = On average, victims’ support for the peace agreement is 14.3 percentage points higher than that of the non-victims in the control group
  • \(β_3\): Difference in treatment effects of \(D_i\) depending on \(Victim_i\) = The educational video results in an additional increase for victims of about 21.8 percentage points, compared to the effect for non-victims

Here is a visual illustration of the values rendered by the model with interaction terms

moderation_df %>%
  dplyr::group_by(treatment,victim_verbose) %>%
  dplyr::mutate(avg_support = mean(support_thermometer)) %>%
  ggplot(., aes(x = support_thermometer, fill = factor(treatment))) +
  geom_density(alpha = 0.5) +
  geom_vline(aes(xintercept = avg_support), linetype = "longdash") +
  facet_grid(treatment~victim_verbose) +
  theme_bw() +
  labs(x = "Support thermometer",
       y = "Density",
       fill = "Treatment") +
  theme(legend.position = "bottom") +
  scale_fill_manual(name = " ", # changes to fill dimension
                    values = c("#a7a8aa", "#cc0055"),
                    labels = c("Control", "Treatment"))


Extracting marginal/partial effects from our interaction models

For this portion, we are interested in marginal/partial effects, which we will extract through the avg_slopes() from the marginaleffects package and margins() function from the margins package.

Some packages in R aimed at rendering marginal effects do render the predictive margins instead. For the purposes of the class, when asked to render marginal or partial effects you are expected to render them as introduced in the lecture (i.e., \(\frac{\partial Y_i}{\partial {D}_i}\)). When asked to provide Group-Average Marginal Effects (G-AME), you will want to employ these.

You can check the documentation for this functions here and here. We recommend using the marginaleffects package by Vincent Arel-Bundock, as it is the better documented and more up-to-date suite for this use cases. The syntax for the functions is pretty similar:

marginaleffects::avg_slopes(name_of_your_model, variables= "treatment_", by = "moderator_variable")
margins::margins(name_of_your_model, variables = "treatment_var", at = list(moderator_variable = c("value1", "value2", "value3"))

Let’s say we are interested in the marginal/partial effect of our video treatment for victims and non-victims. We would do:

marginaleffects::avg_slopes(interaction_model,variables= "treatment", by = "victim")
## 
##  victim Estimate Std. Error    z Pr(>|z|)    S 2.5 % 97.5 %
##       0     8.01      0.776 10.3   <0.001 80.7  6.49   9.53
##       1    29.78      0.776 38.4   <0.001  Inf 28.26  31.30
## 
## Term: treatment
## Type:  response 
## Comparison: 1 - 0
summary(margins::margins(interaction_model, variables = "treatment", at = list(victim = c(0,1)))) #note the summary(), see what happens when you run it without it
##     factor victim     AME     SE       z      p   lower   upper
##  treatment 0.0000  8.0125 0.7757 10.3288 0.0000  6.4921  9.5330
##  treatment 1.0000 29.7831 0.7758 38.3895 0.0000 28.2626 31.3037

Let’s plot the marginal/partial effect of the treatment for victims and non-victims

# Let's save this into an object to understand how it is structured compared to the console output
pe_margins <- marginaleffects::avg_slopes(interaction_model,variables= "treatment", by = "victim")

pe_plotting <- tibble::tibble(pe_margins) %>%
  dplyr::select(victim, estimate, conf.low, conf.high) %>% # you will need to adapt this based your moderator
  dplyr::mutate(victim_labels = ifelse(victim == 1, "Victim", "No Victim")) # you may not need this to create labels for the assignment

ggplot(pe_plotting, aes(x = victim_labels,
                        y = estimate)) +
  geom_point(size = 1.5) +
  geom_segment(aes(x = victim_labels, xend = victim_labels, y = conf.low, yend = conf.high), size = 0.5) + # render whiskers from confidence intervals
  theme_minimal() +
  scale_y_continuous(limits = c(0,45)) + # you may need to change the limits for your plots based on the specific effects of your application
  labs(x = "Respondent status\n",
       y = "\nGroup average marginal effect of educational video", 
       caption = "Note: You can utilize it to describe what the plot illustrates during your assignment.") + 
  coord_flip() # to flip the plot


Extracting predictive margins from our interaction models

For this portion, we are interested in predictive margins. In here, our interest is to return the expectation for each level of a predictor. We will extract this through the avg_predictions() function from the marginaleffects package and ggeffects() function from the ggeffects package. You can check the documentation here and here. The syntax is the following:

marginaleffects::avg_predictions(name_of_your_model, variables = c("termA", "termB"))
ggeffects::ggeffect(name_of_your_model, terms = c("termA", "termB"))

Let’s say we are interested in the predictive margins for all out victim and treatment permutations. We would do:

marginaleffects::avg_predictions(interaction_model, variables = c("victim","treatment"))
## 
##  victim treatment Estimate Std. Error     z Pr(>|z|)   S 2.5 % 97.5 %
##       0         0     27.9      0.549  50.9   <0.001 Inf  26.9   29.0
##       0         1     35.9      0.549  65.5   <0.001 Inf  34.9   37.0
##       1         0     42.2      0.549  76.9   <0.001 Inf  41.1   43.3
##       1         1     72.0      0.549 131.2   <0.001 Inf  70.9   73.0
## 
## Type:  response
ggeffects::ggeffect(interaction_model, terms = c("victim","treatment"))
## # Predicted values of support_thermometer
## 
## # treatment = 0
## 
## victim | Predicted |         95% CI
## -----------------------------------
##      0 |     27.93 | [26.85, 29.00]
##      1 |     42.18 | [41.11, 43.26]
## 
## # treatment = 1
## 
## victim | Predicted |         95% CI
## -----------------------------------
##      0 |     35.94 | [34.86, 37.02]
##      1 |     71.97 | [70.89, 73.04]

Let’s plot these predictive margins

In this exercise we will meet a very important function from dplyr, dplyr::case_when(). This function allows us to vectorize multiple ifelse() statements. The syntax is the following:

dplyr::case_when(condition ~ what to do if met)

Let’s see it at play.

# Let's save this into an object to understand how it is structured compared to the console output
pred_marg <-  tibble::tibble(marginaleffects::avg_predictions(interaction_model, variables = c("victim","treatment")))

pred_marg_plotting <- pred_marg %>%
  dplyr::mutate(group_labels = dplyr::case_when(victim == 1 & treatment == 1 ~ "Victim (1) - Treat (1)",
                                                victim == 1 & treatment == 0 ~ "Victim (1) - Treat (0)",
                                                victim == 0 & treatment == 1 ~ "Victim (0) - Treat (1)",
                                                victim == 0 & treatment == 0 ~ "Victim (0) - Treat (0)"
  )) # adding labels to (victim,treatment) combinations for the plot

pred_marg_plotting
## # A tibble: 4 × 10
##   victim treatment estimate std.error statistic p.value s.value conf.low
##    <int>     <int>    <dbl>     <dbl>     <dbl>   <dbl>   <dbl>    <dbl>
## 1      0         0     27.9     0.549      50.9       0     Inf     26.9
## 2      0         1     35.9     0.549      65.5       0     Inf     34.9
## 3      1         0     42.2     0.549      76.9       0     Inf     41.1
## 4      1         1     72.0     0.549     131.        0     Inf     70.9
## # ℹ 2 more variables: conf.high <dbl>, group_labels <chr>
ggplot(pred_marg_plotting, aes(x = group_labels, 
                               y = estimate,
                               color = factor(treatment),
                               shape = factor(treatment))) + 
  geom_point(size = 1.5) +
  geom_segment(aes(x = group_labels, 
                   xend = group_labels, 
                   y = conf.high, 
                   yend = conf.low), size = 0.5) + # render whiskers from confidence intervals
  theme_minimal() +
  labs(x = "\nRespondent status",
       y = "Predictive margins\n(Support thermometer)",
       caption = "Note: You can utilize it to describe what the plot illustrates during your assignment.") +
  theme(legend.position = "none") +
  scale_color_manual(values=c("#000000", "#800080")) +
  scale_shape_manual(values = c(19,15))


Drafting some brief recommedations

After conducting your experiment, you report back to the task-force. Based on your results, you suggest that the educational videos may be a useful tool to encourage the wider public to hold a warmer opinion about the peace agreement. You also tell the task-force that, although the videos have an average positive effect, they affect with a higher intensity victims of the conflict. You suggest to develop alternative strategies to tackle the non-victims, so that they do not fall through the cracks.


A recommendation that relates to nonlinear models and moderation

The world is a complex place, and making sense of it often requires the use of advanced statistical models. In this class, we have engaged with a couple of them. In the policy context, you will need to interpret the results of such models and communicate your findings clearly. These skills are essential, as we normally face challenges when trying to explain their analyses to colleagues and stakeholders.

We recommend this webpage/book: Model to Meaning: How to Interpret Statistical Models with marginaleffects for R and Python by Vincent Arel-Bundock, Noah Greifer, and Andrew Heiss

This is a terrific resource of useful information when it comes to considering quantities of interest and interpreting model output.


Acknowledgements

This script was drafted by Lisa Oswald and Sebastian Ramirez-Ruiz, with contributions by Adelaida Barrera Daza, Carolina Díaz Suárez, Sofía García-Durrer, and William Fernandez.